Multistream Diarization Fusion Using the Minimum Variance Bayesian Information Criterion

Authors

  • Tae Jin Park
  • Panayiotis Georgiou
Abstract

Speaker diarization is necessary with ubiquitous and individualized recorders. We focus on the specific task of speaker diarization from two information streams, two microphones, assigned to two participants of interest. In real scenarios, speakers may be co-located in noisy environments with interfering speakers. Multistream diarization can exploit additional information, and diarization fusion is then necessary. In this work we first introduce a new database that realistically simulates a range of extremely challenging acoustic conditions, and we then propose a Minimum Variance of Bayesian Information Criterion (MVBIC) method to combine information from the various diarization streams. To validate the proposed method, we use a 2-microphone subset of our proposed database, with Root Mean Square Energy (RMSE) and Mel Frequency Cepstral Coefficients (MFCC) as our two diarization streams. We show that the proposed method exploits the complementarity of the individual diarization streams and outperforms static fusion mixing weights. We also demonstrate the robustness of the MVBIC method on RT-06S data.
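The abstract does not spell out how MVBIC fuses the streams, so the following is only a minimal sketch of the standard Bayesian Information Criterion difference (ΔBIC) that BIC-based diarization commonly builds on, not the authors' fusion method. It assumes each segment is modelled by a single full-covariance Gaussian; the function names `log_det_cov` and `delta_bic` and the penalty weight `lam` are hypothetical names chosen for illustration.

```python
import numpy as np


def log_det_cov(x):
    """Log-determinant of the maximum-likelihood covariance of the rows of x."""
    cov = np.atleast_2d(np.cov(x, rowvar=False, bias=True))
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def delta_bic(x1, x2, lam=1.0):
    """Standard delta-BIC between two feature segments (frames x dims).

    Assumption: one full-covariance Gaussian per segment. A positive value
    favours modelling the segments separately (different speakers); a
    negative value favours merging them.
    """
    n1, d = x1.shape
    n2 = x2.shape[0]
    n = n1 + n2
    x = np.vstack([x1, x2])
    # Penalty for the extra parameters of the second Gaussian:
    # d mean terms plus d(d+1)/2 covariance terms.
    penalty = 0.5 * (d + 0.5 * d * (d + 1)) * np.log(n)
    return (0.5 * n * log_det_cov(x)
            - 0.5 * n1 * log_det_cov(x1)
            - 0.5 * n2 * log_det_cov(x2)
            - lam * penalty)


# Synthetic stand-ins for feature frames (not real MFCC or RMSE data).
rng = np.random.default_rng(0)
seg_a = rng.normal(0.0, 1.0, size=(200, 3))
seg_b = rng.normal(5.0, 1.0, size=(200, 3))   # clearly different source
seg_c = rng.normal(0.0, 1.0, size=(200, 3))   # same source as seg_a

different = delta_bic(seg_a, seg_b)
same = delta_bic(seg_a, seg_c)
```

In this sketch, `different` comes out positive and `same` negative, which is the sign convention a BIC-based change detector or clustering step would rely on. How such per-stream scores are combined is the subject of the paper itself and is not reproduced here.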

Similar resources

Speaker Diarization Using Gaussian Mixture Turns and Segment Matching

Speaker diarization aims to detect “who spoke when” in large audio segments. It is an important task in the processing of broadcast news audio, as it makes audio segment selection and indexing easier. In this paper an unsupervised speaker diarization scheme is proposed using a Gaussian Mixture Model as a Universal Background Model, the Bayesian Information Criterion, and fingerprint detection. A de...

Improving Speaker Diarization

This paper describes the LIMSI speaker diarization system used in the RT-04F evaluation. The RT-04F system builds upon the LIMSI baseline data partitioner, which is used in the broadcast news transcription system. This partitioner provides a high cluster purity but has a tendency to split the data from a speaker into several clusters when there is a large quantity of data for the speaker. In th...

Speaker Diarization: From Broadcast News to Lectures

This paper presents the LIMSI speaker diarization system for lecture data, in the framework of the Rich Transcription 2006 Spring (RT-06S) meeting recognition evaluation. This system builds upon the baseline diarization system designed for broadcast news data. The baseline system combines agglomerative clustering based on Bayesian information criterion with a second clustering using state-of-th...

An improved speaker diarization system

This paper describes an automatic speaker diarization system for natural, multi-speaker meeting conversations. Only one central microphone is used to record the meeting. The new system is robust to different acoustic environments; it requires neither pre-training models nor development sets to initialize the parameters. The new system determines the model complexity automatically. It adapts the ...

Investigating Various Diarization Algorithms for Speaker in the Wild (SITW) Speaker Recognition Challenge

Collecting training data for real-world text-independent speaker recognition is challenging. In practice, utterances for a specific speaker are often mixed with many other acoustic signals. To guarantee the recognition performance, the segments spoken by target speakers should be precisely picked out. An automatic detection could be developed to reduce the cost of expensive human hand-made anno...


Publication date: 2018